SentTopic-MultiRank: a Novel Ranking Model for Multi-Document Summarization

Authors

  • Wenpeng Yin
  • Yulong Pei
  • Fan Zhang
  • Lian'en Huang
Abstract

Extractive multi-document summarization is mostly treated as a sentence ranking problem. Existing graph-based ranking methods for key-sentence extraction usually attempt to compute a global importance score for each sentence under a single relation. Motivated by the fact that both documents and sentences can be represented as a mixture of semantic topics detected by Latent Dirichlet Allocation (LDA), we propose SentTopic-MultiRank, a novel ranking model for multi-document summarization. It treats the topics as heterogeneous relations and models sentence connections across multiple topics as a heterogeneous network, in which sentences and topics/relations are linked together. The iterative MultiRank algorithm is then run to determine the importance of sentences and topics simultaneously. Experimental results demonstrate the effectiveness of our model in improving performance on both generic and query-biased multi-document summarization tasks.

Similar resources

WordTopic-MultiRank: A New Method for Automatic Keyphrase Extraction

Automatic keyphrase extraction aims to pick out a set of terms as a representation of a document without manual assignment efforts. Supervised and unsupervised graph-based ranking methods have been studied for this task. However, previous methods usually computed importance scores of words under the assumption of a single relation between words. In this work, we propose WordTopic-MultiRank as a n...

Multi-layered graph-based multi-document summarization model

Multi-document summarization is the process of automatically generating a compressed version of a given collection of documents. Recently, graph-based models and ranking algorithms have been actively investigated by the extractive document summarization community. While most work to date focuses on homogeneous connectedness of sentences and heterogeneous connectedness of documents and sentence...

An Exploration of Document Impact on Graph-Based Multi-Document Summarization

The graph-based ranking algorithm has recently been exploited for multi-document summarization by making use only of the sentence-to-sentence relationships in the documents, under the assumption that all the sentences are indistinguishable. However, given a document set to be summarized, different documents are usually not equally important, and moreover, different sentences in a specific docum...

Decayed DivRank for Guided Summarization

Guided summarization is essentially an aspect-based multi-document summarization, where aspects can be taken as specified queries in summarization. We proposed a novel ranking algorithm, Decayed DivRank (DDRank) for guided summarization tasks of TAC2011. DDRank can address relevance, importance, diversity, and novelty simultaneously through a decayed vertex-reinforced random walk process in sen...

Multi-Document Summarization via Discriminative Summary Reranking

Existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and e...

Publication date: 2012